AITopics | Watertown

Watertown

Fact-checking AI-generated news reports: Can LLMs catch their own lies?

arXiv.org Artificial Intelligence Mar-23-2025

In this paper, we evaluate the ability of Large Language Models (LLMs) to assess the veracity of claims in "news reports" generated by themselves or other LLMs. Our goal is to determine whether LLMs can effectively fact-check their own content, using methods similar to those used to verify claims made by humans. Our findings indicate that LLMs are more effective at assessing claims in national or international news stories than in local news stories, better at evaluating static information than dynamic information, and better at verifying true claims compared to false ones. We hypothesize that this disparity arises because the former types of claims are better represented in the training data. Additionally, we find that incorporating retrieved results from a search engine in a Retrieval-Augmented Generation (RAG) setting significantly reduces the number of claims an LLM cannot assess. However, this approach also increases the occurrence of incorrect assessments, partly due to irrelevant or low-quality search results. This diagnostic study highlights the need for future research on fact-checking machine-generated reports to prioritize improving the precision and relevance of retrieved information to better support fact-checking efforts. Furthermore, claims about dynamic events and local news may require human-in-the-loop fact-checking systems to ensure accuracy and reliability.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.18293

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.05)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Massachusetts > Middlesex County > Watertown (0.04)
(16 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Media > News (1.00)
Leisure & Entertainment > Sports > Basketball (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
Leisure & Entertainment > Sports > Olympic Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Kernels of Selfhood: GPT-4o shows humanlike patterns of cognitive consistency moderated by free choice

Lehr, Steven A., Saichandran, Ketan S., Harmon-Jones, Eddie, Vitali, Nykko, Banaji, Mahzarin R.

arXiv.org Artificial Intelligence Jan-26-2025

Large Language Models (LLMs) have surprised the scientific community and even their creators by exhibiting emergent abilities once thought to be uniquely human, such as advanced cognition and reasoning (1-6), although the full extent of these accomplishments is debated (3, 7-10). These capabilities align with the rational and deliberative aspects of human nature, but humans are not purely rational creatures, and it is unclear whether LLMs will mimic a broader spectrum of human psychological tendencies. Here we test whether OpenAI's GPT-4o replicates behaviors associated with the human tendency toward cognitive consistency as well as human sensitivity to choice, characterized by greater attitude shifts when the behaviors inducing these changes are freely chosen. Decades of research demonstrate that humans will irrationally twist their attitudes to align with behaviors they were induced to perform. For example, consider an individual who opposes single-payer healthcare, but volunteers, in response to a request for help, to craft an argument in favor of the policy. Rationally, this individual's attitude toward single-payer healthcare should not move in a more supportive direction; they should be able to discriminate between their genuine attitude and the opposing one that they have articulated only to be helpful.

large language model, machine learning, no-choice condition, (22 more...)

arXiv.org Artificial Intelligence

2502.07088

Country:

Asia > Russia (0.48)
Asia > China (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Government > Regional Government > Asia Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Biomechanical Comparison of Human Walking Locomotion on Solid Ground and Sand

Zhu, Chunchu, Chen, Xunjie, Yi, Jingang

arXiv.org Artificial Intelligence Jun-28-2024

Current studies on human locomotion focus mainly on solid ground walking conditions. In this paper, we present a biomechanical comparison of human walking locomotion on solid ground and sand. A novel dataset containing 3-dimensional motion and biomechanical data from 20 able-bodied adults for locomotion on solid ground and sand is collected. We present the data collection methods and report the sensor data along with the kinematic and kinetic profiles of joint biomechanics. A comprehensive analysis of human gait and joint stiffness profiles is presented. The kinematic and kinetic analysis reveals that human walking locomotion on sand shows different ground reaction forces and joint torque profiles, compared with those patterns from walking on solid ground. These gait differences reflect that humans adopt motion control strategies for yielding terrain conditions such as sand. The dataset also provides a source of locomotion data for researchers to study human activity recognition and assistive devices for walking on different terrains.

locomotion, solid ground, terrain, (16 more...)

arXiv.org Artificial Intelligence

2403.03105

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.14)
South America > Bolivia (0.04)
North America > United States > Massachusetts > Middlesex County > Watertown (0.04)
(3 more...)

Genre: Research Report > New Finding (0.95)

Industry: Health & Medicine > Consumer Health (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (0.94)

ChatGPT as Research Scientist: Probing GPT's Capabilities as a Research Librarian, Research Ethicist, Data Generator and Data Predictor

Lehr, Steven A., Caliskan, Aylin, Liyanage, Suneragiri, Banaji, Mahzarin R.

arXiv.org Artificial Intelligence Jun-20-2024

How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research Librarian, Research Ethicist, Data Generator, and Novel Data Predictor, using psychological science as a testing field. In Study 1 (Research Librarian), unlike human researchers, GPT-3.5 and GPT-4 hallucinated, authoritatively generating fictional references 36.0% and 5.4% of the time, respectively, although GPT-4 exhibited an evolving capacity to acknowledge its fictions. In Study 2 (Research Ethicist), GPT-4 (though not GPT-3.5) proved capable of detecting violations like p-hacking in fictional research protocols, correcting 88.6% of blatantly presented issues, and 72.6% of subtly presented issues. In Study 3 (Data Generator), both models consistently replicated patterns of cultural bias previously discovered in large language corpora, indicating that ChatGPT can simulate known results, an antecedent to usefulness for both data generation and skills like hypothesis generation. Contrastingly, in Study 4 (Novel Data Predictor), neither model was successful at predicting new results absent in their training data, and neither appeared to leverage substantially new information when predicting more versus less novel outcomes. Together, these results suggest that GPT is a flawed but rapidly improving librarian, a decent research ethicist already, capable of data generation in simple domains with known characteristics but poor at predicting novel patterns of empirical data to aid future experimentation.

gpt, gpt-3, gpt-4, (15 more...)

arXiv.org Artificial Intelligence

2406.14765

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Watertown (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (0.92)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference

Hohman, Fred, Wang, Chaoqun, Lee, Jinmook, Görtler, Jochen, Moritz, Dominik, Bigham, Jeffrey P, Ren, Zhile, Foret, Cecile, Shan, Qi, Zhang, Xiaoyi

arXiv.org Artificial Intelligence Apr-3-2024

On-device machine learning (ML) moves computation from the cloud to personal devices, protecting user privacy and enabling intelligent user experiences. However, fitting models on devices with limited resources presents a major technical challenge: practitioners need to optimize models and balance hardware metrics such as model size, latency, and power. To help practitioners create efficient ML models, we designed and developed Talaria: a model visualization and optimization system. Talaria enables practitioners to compile models to hardware, interactively visualize model statistics, and simulate optimizations to test the impact on inference metrics. Since its internal deployment two years ago, we have evaluated Talaria using three methodologies: (1) a log analysis highlighting its growth of 800+ practitioners submitting 3,600+ models; (2) a usability survey with 26 users assessing the utility of 20 Talaria features; and (3) a qualitative interview with the 7 most active users about their experience using Talaria.

optimization, practitioner, talaria, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3613904.3642628

2404.03085

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.05)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(8 more...)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CEO: Corpus-based Open-Domain Event Ontology Induction

Xu, Nan, Zhang, Hongming, Chen, Jianshu

arXiv.org Artificial Intelligence May-22-2023

Existing event-centric NLP models often apply only to a pre-defined ontology, which significantly restricts their generalization capabilities. This paper presents CEO, a novel Corpus-based Event Ontology induction model to relax the restriction imposed by pre-defined event ontologies. Without direct supervision, CEO leverages distant supervision from available summary datasets to detect corpus-wise salient events and exploits external event knowledge to force events within a short distance to have close embeddings. Experiments on three popular event datasets show that the schema induced by CEO has better coverage and higher accuracy than previous methods. Moreover, CEO is the first event ontology induction model that can induce a hierarchical event ontology with meaningful names on eleven open-domain corpora, making the induced schema more trustworthy and easier to curate further.

artificial intelligence, gpt-j-6b, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2305.13521

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Washington > King County > Seattle (0.04)
South America > Brazil (0.04)
(21 more...)

Genre: Research Report (0.40)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Subtyping patients with chronic disease using longitudinal BMI patterns

Mottalib, Md Mozaharul, Jones-Smith, Jessica C, Sheridan, Bethany, Beheshti, Rahmatollah

arXiv.org Artificial Intelligence Feb-7-2023

Obesity is a major health problem, increasing the risk of various major chronic diseases, such as diabetes, cancer, and stroke. While the role of obesity identified by cross-sectional BMI recordings has been heavily studied, the role of BMI trajectories is much less explored. In this study, we use a machine-learning approach to subtype individuals' risk of developing 18 major chronic diseases by using their BMI trajectories extracted from a large and geographically diverse EHR dataset capturing the health status of around two million individuals for a period of six years. We define nine new interpretable and evidence-based variables based on the BMI trajectories to cluster the patients into subgroups using the k-means clustering method. We thoroughly review each cluster's characteristics in terms of demographic, socioeconomic, and physiological measurement variables to specify the distinct properties of the patients in the clusters. In our experiments, the direct relationship of obesity with diabetes, hypertension, Alzheimer's, and dementia has been re-established and distinct clusters with specific characteristics for several of the chronic diseases have been found to be conforming or complementary to the existing body of knowledge.

artificial intelligence, machine learning, trajectory, (20 more...)

arXiv.org Artificial Intelligence

2111.05385

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Delaware > New Castle County > Newark (0.14)
Europe > Middle East > Malta > Northern Region > Western District > Attard (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Forecasting labels under distribution-shift for machine-guided sequence design

Wheelock, Lauren Berk, Malina, Stephen, Gerold, Jeffrey, Sinai, Sam

arXiv.org Artificial Intelligence Nov-18-2022

The ability to design and optimize biological sequences with specific functionalities would unlock enormous value in technology and healthcare. In recent years, machine learning-guided sequence design has progressed this goal significantly, though validating designed sequences in the lab or clinic takes many months and substantial labor. It is therefore valuable to assess the likelihood that a designed set contains sequences of the desired quality (which often lies outside the label distribution in our training data) before committing resources to an experiment. Forecasting, a prominent concept in many domains where feedback can be delayed (e.g. elections), has not been used or studied in the context of sequence design. Here we propose a method to guide decision-making that forecasts the performance of high-throughput libraries (e.g. containing $10^5$ unique variants) based on estimates provided by models, providing a posterior for the distribution of labels in the library. We show that our method outperforms baselines that naively use model scores to estimate library performance, which are the only tool available today for this purpose.

artificial intelligence, machine learning, modeling & simulation, (20 more...)

arXiv.org Artificial Intelligence

2211.10422

Country:

Europe > France (0.05)
North America > United States > Massachusetts > Middlesex County > Watertown (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Modeling & Simulation (0.94)

Learning the shape of protein micro-environments with a holographic convolutional neural network

Pun, Michael N., Ivanov, Andrew, Bellamy, Quinn, Montague, Zachary, LaMont, Colin, Bradley, Philip, Otwinowski, Jakub, Nourmohammad, Armita

arXiv.org Artificial Intelligence Nov-5-2022

Proteins play a central role in biology from immune recognition to brain activity. While major advances in machine learning have improved our ability to predict protein structure from sequence, determining protein function from structure remains a major challenge. Here, we introduce Holographic Convolutional Neural Network (H-CNN) for proteins, which is a physically motivated machine learning approach to model amino acid preferences in protein structures. H-CNN reflects physical interactions in a protein structure and recapitulates the functional information stored in evolutionary data. H-CNN accurately predicts the impact of mutations on protein function, including stability and binding of protein complexes. Our interpretable computational model for protein structure-function maps could guide design of novel proteins with desired function.

amino acid, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2211.02936

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Middlesex County > Watertown (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Germany > Lower Saxony > Gottingen (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Edge.org

#artificialintelligence Mar-27-2022, 05:47:14 GMT

The conversation is on hold. The Edge community has hit the road... or they're staying home. Preparing for the academic year to begin, wrapping up projects and starting new ones, celebrating with family and friends or contemplating in solitude. After a hiatus, Edge is pleased to revive Summer Postcards: Edgies reporting in from wherever they are and on whatever they're doing, as the dog days wind out and the season comes to a close. As the world slowly returns to a "new normal" with enduring COVID restrictions in the midst of renewed vaccine freedoms, this year's collection is a testament to change (temporary and lasting), a consideration of loss (will travel ever be like it was?), and a celebration of questions (that still need answering). The hammock may be away until next year, but the memories remain.
I spent the summer writing and revising the final section of a longish novel I started in 2019. It seems now as though I've been from 1946 to 2021 on my hands and knees. Various lockdowns have been a liberation from obligations and the luggage carousel, and I've never known such sweet and total focus for months on end. We have the luxury of living in the country--no shortage of big skies and moody walks. All our few breaks were in the UK--Scotland, the Lake District, the West country. Even in our remote part of the Lakes, I had to keep on writing--as in photo. The best novel I read this summer was Sandro Veronesi's The Hummingbird. Best non-fiction was Peter Godfrey Smith's Metazoa: Animal Life and the Birth of the Mind. I gave time also to some wonderful novellas--perfect fictional form for you too-busy scientists.
IAN MCEWAN is a novelist whose works have earned him worldwide critical acclaim. He is the recipient of the Man Booker Prize for Amsterdam (1998), the National Book Critics' Circle Fiction Award, and the Los Angeles Times Prize for Fiction for Atonement (2003). His most recent novel is Machines Like Me.
In 2019, Časlav Brukner and myself were walking on a beach on Lamma Island, near Hong Kong, marvelling together at the astonishing strangeness of quantum phenomena. This summer, the conversation with Časlav has continued on another island, and quite an island: Lesbos, the northern Greek island near the Turkish coast. Lesbos is the place where lyrical poetry was born. Here lived Sappho and Alcaeus.

consciousness, physics, university, (16 more...)

#artificialintelligence

Country:

Europe > United Kingdom > Scotland (0.24)
North America > United States > California > Los Angeles County > Los Angeles (0.24)
Europe > Netherlands > North Holland > Amsterdam (0.24)
(44 more...)

Genre: Personal > Honors (0.86)

Industry:

Media (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Education > Educational Setting (0.93)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Machine Learning (0.93)
